Self-Driving Car Engineer Nanodegree

Deep Learning

Project: Build a Traffic Sign Recognition Classifier

In this notebook, a template is provided for you to implement your functionality in stages which is required to successfully complete this project. If additional code is required that cannot be included in the notebook, be sure that the Python code is successfully imported and included in your submission, if necessary. Sections that begin with 'Implementation' in the header indicate where you should begin your implementation for your project. Note that some sections of implementation are optional, and will be marked with 'Optional' in the header.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. In addition, Markdown cells can be edited by typically double-clicking the cell to enter edit mode.


Step 1: Dataset Exploration

Visualize the German Traffic Signs Dataset. This is open ended, some suggestions include: plotting traffic signs images, plotting the count of each sign, etc. Be creative!

The pickled data is a dictionary with 4 key/value pairs:

  • features -> the images pixel values, (width, height, channels)
  • labels -> the label of the traffic sign
  • sizes -> the original width and height of the image, (width, height)
  • coords -> coordinates of a bounding box around the sign in the image, (x1, y1, x2, y2). Based the original image (not the resized version).
In [1]:
"""Import packages"""
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import math
import os
import cv2
import tensorflow as tf
import time

%matplotlib inline
In [2]:
import pickle

# TODO: fill this in based on where you saved the training and testing data
training_file = '/Users/blakejacquot/Dropbox/MOOCs/Udacity_SelfDrivingCar/Term1/TrafficSignClassifier/traffic-signs-data/train.p'
testing_file = '/Users/blakejacquot/Dropbox/MOOCs/Udacity_SelfDrivingCar/Term1/TrafficSignClassifier/traffic-signs-data/test.p'


with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
    
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']
In [3]:
### To start off let's do a basic data summary.
labels = {}
for el in y_train:
    if el in labels.keys():
        labels[el] += 1
    else:
        labels[el] = 1
        
print(labels.keys())

# TODO: number of training examples
n_train = len(y_train)

# TODO: number of testing examples
n_test = len(y_test)

# TODO: what's the shape of an image?
single_image = X_train[0][:][:][:]
image_shape = single_image.shape

# TODO: how many classes are in the dataset
n_classes = len(labels.keys())

print(' ')
print('X_train shape = ', X_train.shape)
print('y_train shape = ', y_train.shape)
print('X_test shape = ', X_test.shape)
print('y_test shape = ', y_test.shape)
print("Type of X_train = ", type(X_train))
print("Type of y_train = ", type(y_train))
print(' ')
print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42])
 
X_train shape =  (39209, 32, 32, 3)
y_train shape =  (39209,)
X_test shape =  (12630, 32, 32, 3)
y_test shape =  (12630,)
Type of X_train =  <class 'numpy.ndarray'>
Type of y_train =  <class 'numpy.ndarray'>
 
Number of training examples = 39209
Number of testing examples = 12630
Image data shape = (32, 32, 3)
Number of classes = 43
In [4]:
### Data exploration visualization goes here.
### Feel free to use as many code cells as needed.
In [5]:
"""Helper functions for data categorization and exploration"""

def make_class_dict(y):
    class_dict = {}
    num_el = len(y)
    for i in range(num_el):
        curr_class = y[i]
        if curr_class not in class_dict.keys():
            class_dict[curr_class] = [i]
        else:
            pos_index = class_dict[curr_class]
            pos_index.append(i)
            class_dict[curr_class] = pos_index
    return class_dict

import random
def plot_random(X, class_dict):
    for curr_class in class_dict.keys():
        pos_index = class_dict[curr_class]
        len_index = len(pos_index)
        i1 = random.randrange(len_index)
        i2 = random.randrange(len_index)
        i3 = random.randrange(len_index)
        i4 = random.randrange(len_index)
        i5 = random.randrange(len_index)
        i6 = random.randrange(len_index)
        i7 = random.randrange(len_index)
        i8 = random.randrange(len_index)
        i9 = random.randrange(len_index)
        print('Current class = ' + str(curr_class))
        index1 = pos_index[i1]
        index2 = pos_index[i2]
        index3 = pos_index[i3]
        index4 = pos_index[i4]
        index5 = pos_index[i5]
        index6 = pos_index[i6]
        index7 = pos_index[i7]
        index8 = pos_index[i8]
        index9 = pos_index[i9]

        im1 = X[index1][:][:][:]
        im2 = X[index2][:][:][:]
        im3 = X[index3][:][:][:]
        im4 = X[index4][:][:][:]
        im5 = X[index5][:][:][:]
        im6 = X[index6][:][:][:]
        im7 = X[index7][:][:][:]
        im8 = X[index8][:][:][:]
        im9 = X[index9][:][:][:]
     
        plt.figure()
        plt.subplot(331)
        plt.imshow(im1)
        plt.subplot(332)
        plt.imshow(im2)
        plt.subplot(333)
        plt.imshow(im3)

        plt.subplot(334)
        plt.imshow(im4)
        plt.subplot(335)
        plt.imshow(im5)
        plt.subplot(336)
        plt.imshow(im6)
        
        plt.subplot(337)
        plt.imshow(im7)
        plt.subplot(338)
        plt.imshow(im8)
        plt.subplot(339)
        plt.imshow(im9)
        
        plt.show() 
    plt.close("all")
    
In [6]:
"""Organize images into dictionaries"""
class_dict_train = make_class_dict(y_train)
class_dict_test = make_class_dict(y_test)
In [33]:
"""Display random images from training set"""
plot_random(X_train, class_dict_train)
Current class = 0
Current class = 1
Current class = 2
Current class = 3
Current class = 4
Current class = 5
Current class = 6
Current class = 7
Current class = 8
Current class = 9
Current class = 10
Current class = 11
Current class = 12
Current class = 13
Current class = 14
Current class = 15
Current class = 16
Current class = 17
Current class = 18
Current class = 19
Current class = 20
Current class = 21
Current class = 22
Current class = 23
Current class = 24
Current class = 25
Current class = 26
Current class = 27
Current class = 28
Current class = 29
Current class = 30
Current class = 31
Current class = 32
Current class = 33
Current class = 34
Current class = 35
Current class = 36
Current class = 37
Current class = 38
Current class = 39
Current class = 40
Current class = 41
Current class = 42
In [34]:
"""Display random images from testing set"""
plot_random(X_test, class_dict_test)
Current class = 0
Current class = 1
Current class = 2
Current class = 3
Current class = 4
Current class = 5
Current class = 6
Current class = 7
Current class = 8
Current class = 9
Current class = 10
Current class = 11
Current class = 12
Current class = 13
Current class = 14
Current class = 15
Current class = 16
Current class = 17
Current class = 18
Current class = 19
Current class = 20
Current class = 21
Current class = 22
Current class = 23
Current class = 24
Current class = 25
Current class = 26
Current class = 27
Current class = 28
Current class = 29
Current class = 30
Current class = 31
Current class = 32
Current class = 33
Current class = 34
Current class = 35
Current class = 36
Current class = 37
Current class = 38
Current class = 39
Current class = 40
Current class = 41
Current class = 42
In [7]:
classes = []
num_entries = []
for key in class_dict_train:
    curr_num_entries = len(class_dict_train[key])
    print('Class %02d has %04d entries' % (key,curr_num_entries))
    classes.append(key)
    num_entries.append(curr_num_entries)

plt.bar(classes,num_entries)
plt.show()
Class 00 has 0210 entries
Class 01 has 2220 entries
Class 02 has 2250 entries
Class 03 has 1410 entries
Class 04 has 1980 entries
Class 05 has 1860 entries
Class 06 has 0420 entries
Class 07 has 1440 entries
Class 08 has 1410 entries
Class 09 has 1470 entries
Class 10 has 2010 entries
Class 11 has 1320 entries
Class 12 has 2100 entries
Class 13 has 2160 entries
Class 14 has 0780 entries
Class 15 has 0630 entries
Class 16 has 0420 entries
Class 17 has 1110 entries
Class 18 has 1200 entries
Class 19 has 0210 entries
Class 20 has 0360 entries
Class 21 has 0330 entries
Class 22 has 0390 entries
Class 23 has 0510 entries
Class 24 has 0270 entries
Class 25 has 1500 entries
Class 26 has 0600 entries
Class 27 has 0240 entries
Class 28 has 0540 entries
Class 29 has 0270 entries
Class 30 has 0450 entries
Class 31 has 0780 entries
Class 32 has 0240 entries
Class 33 has 0689 entries
Class 34 has 0420 entries
Class 35 has 1200 entries
Class 36 has 0390 entries
Class 37 has 0210 entries
Class 38 has 2070 entries
Class 39 has 0300 entries
Class 40 has 0360 entries
Class 41 has 0240 entries
Class 42 has 0240 entries

Step 2: Design and Test a Model Architecture

Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the German Traffic Sign Dataset.

There are various aspects to consider when thinking about this problem:

  • Your model can be derived from a deep feedforward net or a deep convolutional network.
  • Play around preprocessing techniques (normalization, rgb to grayscale, etc)
  • Number of examples per label (some have more than others).
  • Generate fake data.

Here is an example of a published baseline model on this problem. It's not required to be familiar with the approach used in the paper but, it's good practice to try to read papers like these.

Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [8]:
### Preprocess the data here.
### Feel free to use as many code cells as needed.
In [9]:
"""Define helper functions for pre-processing"""

def grayscale_singleimage(img):
    """Applies the Grayscale transform to single image.
    
    This will return an image with only one color channel
    but NOTE: to see the returned image as grayscale
    you should call plt.imshow(gray, cmap='gray')
    
    Args:
        img: numpy array with dimensions [x,y]
    Returns
        Numpy array with dimensions [x, y]
    
    """
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

def grayscale_set(x):
    print('Making grayscale')
    x_shape = x.shape
    num_el = x_shape[0]
    ret_images = np.ones((x_shape[0],x_shape[1],x_shape[2]))
    for i in range(num_el):
        curr_im = x[i,:,:,:]
        ret_images[i,:,:] = grayscale_singleimage(curr_im)
    return ret_images

def normalize_set(x):
    print('Normalizing data')
    x_shape = x.shape
    num_el = x_shape[0]
    ret_images = np.ones((x_shape[0],x_shape[1],x_shape[2]))
    for i in range(num_el):
        curr_im = x[i][:][:][:]
        empty_im = np.ones((x_shape[1],x_shape[2]))
        #proc_im = cv2.normalize(src=curr_im, dst=empty_im, alpha=0.1, beta=0.9, norm_type=cv2.NORM_MINMAX)
        proc_im = cv2.normalize(src=curr_im, dst=empty_im, alpha=-1.,beta=1.,norm_type=cv2.NORM_MINMAX)
        ret_images[i][:][:] = proc_im
    return ret_images

def make_one_hot_encoding(y, num_labels):
    print('Making one hot encoding')
    y_shape = y.shape
    numel = y_shape[0]
    ret_y = np.zeros((numel, num_labels))
    for i in range(numel):
        curr_label = y[i]
        #print('Current label = ', curr_label)
        curr_encoding = np.zeros(num_labels)
        for j in range(num_labels):
            if j == int(curr_label):
                #print('Match!', j, curr_label)
                curr_encoding[j] = 1.0
        #print('Print one-hot encoding of label = ', curr_encoding)
        ret_y[i] = curr_encoding
    return ret_y
In [10]:
"""Preprocess data"""
# Preprocess test data
X_test_preproc = X_test
X_test_preproc = grayscale_set(X_test_preproc)
X_test_preproc = normalize_set(X_test_preproc)
test_data = X_test_preproc

# Preprocess training data
X_train_preproc = X_train
X_train_preproc = grayscale_set(X_train_preproc)
X_train_preproc = normalize_set(X_train_preproc)
train_data = X_train_preproc

"""Print data info"""
print(' ')
print('Data info')
print('Shape of training data = ', train_data.shape)
print('Shape of test data = ', test_data.shape)

train_labels = y_train
test_labels = y_test
print('y_train shape = ', y_train.shape)
print('y_test shape = ', y_test.shape)


train_labels = make_one_hot_encoding(y_train, 43)
test_labels = make_one_hot_encoding(y_test, 43)


# Must make the one-hot encoding float32 so values can be multiplied agains features in TensorFlow, which are float 32
# This is maddening!!!!! I spent HOURS chasing this down. My values were initially in float64, which didn't work.
# The model would train and not through warnings or errors, but accuracy never rose above 10%
# After re-casting in float32, everything worked just fine.
# Wish TensorFlow would at least give a warning if input is not in float32.
train_labels = train_labels.astype(np.float32)
test_labels = test_labels.astype(np.float32)

"""Expand data to have extra channel in order to be compatible with TensorFlow"""
if len(train_data.shape) != 4:
    print('Expanding data')
    train_data = np.expand_dims(np.array(train_data),3)

if len(test_data.shape) != 4:
    print('Expanding data')
    test_data = np.expand_dims(np.array(test_data),3)

print('Train data shape = ', train_data.shape)
print('Test data shape = ', test_data.shape)
Making grayscale
Normalizing data
Making grayscale
Normalizing data
 
Data info
Shape of training data =  (39209, 32, 32)
Shape of test data =  (12630, 32, 32)
y_train shape =  (39209,)
y_test shape =  (12630,)
Making one hot encoding
Making one hot encoding
Expanding data
Expanding data
Train data shape =  (39209, 32, 32, 1)
Test data shape =  (12630, 32, 32, 1)

Question 1

Describe the techniques used to preprocess the data.

Answer: Preprocessing occurred in 4 steps: Grayscale -> Normalize -> One-hot encoding -> Expand dimensions.

Grayscale: Convert each image in data set from color to grayscale. Note, this changes data from numpy array with dimensions (n, 32, 32, 3) to (n, 32, 32).

Normalize: Change range of grayscale images from (0,256) to (-1.0, +1.0)

One-hot encoding: Rather than integer or float labels for each data type (e.g. 1, 1.0, 3, 3.0), convert to a more binary representation so matrix math works (e.g. 3 becomes [0,0,1,0], 4 becomes [0,0,0,1]). The one-hot encoded version must by numpy array. This step is relatively simple. However there is a completely maddening part of this! TensorFlow expects the data type to be float32 so it can multiply by other float32 values internally. If you don't cast it as float32 and instead leave it cast as float64 (which many methods will do), TensorFlow will accept it and not through an error but all the optimization doesn't work well. I had my one-hot encoding initially cast as float64 and could never get accuracy to exceed 10% no matter how many epochs. When I recast as float32, I got better than 90% accuracy.

Expand dimensions: TensorFlow expects color channels for images. So, it won't accept numpy arrays with dimensions (n, 32, 32). Expand this to (n, 32, 32, 1) so it works, even though the last dimension is meaningless in terms of information.

In [11]:
### Generate data additional (if you want to!)
### and split the data into training/validation/testing sets here.
### Feel free to use as many code cells as needed.

from sklearn.model_selection import train_test_split

"""Make a validation set from some of training data"""
print('Train shape before validation split', train_data.shape, train_labels.shape)
train_data, validation_data, train_labels, validation_labels = train_test_split(train_data, train_labels,test_size=0.05, random_state=101)
print('Train shape after validation split', train_data.shape, train_labels.shape)
Train shape before validation split (39209, 32, 32, 1) (39209, 43)
Train shape after validation split (37248, 32, 32, 1) (37248, 43)
In [12]:
"""Report General statistics on training, testing, validation data sets"""
# Want training, test split to be 80%, 20% of total data
# Of the remaining training data, want training, validation split to be 80%, 20%

num_train = train_labels.shape[0]
num_test = test_labels.shape[0]
num_validation = validation_labels.shape[0]
total_samples = num_train + num_test + num_validation

print('Testing as percentage of whole = %f' % (num_test/total_samples))
print('Training as percentage of whole = %f' % (num_train/total_samples))
print('Validation as percentage of whole = %f' % (num_validation/total_samples))
Testing as percentage of whole = 0.243639
Training as percentage of whole = 0.718532
Validation as percentage of whole = 0.037829
In [13]:
print('Train data shape = ', train_data.shape)
print('Validation data shape = ', validation_data.shape)
print('Test data shape = ', test_data.shape)
Train data shape =  (37248, 32, 32, 1)
Validation data shape =  (1961, 32, 32, 1)
Test data shape =  (12630, 32, 32, 1)

Question 2

Describe how you set up the training, validation and testing data for your model. If you generated additional data, why?

Answer: I left 'test' as it was imported, which is ~25% of total data. For the 'Training' set, I split off 5% of the data. The ideal split would be closer to 20% 'Test' and 80% 'Training'. Of the 'Training' data, split off 5% for 'Validation'.

In [14]:
### Define your architecture here.
### Feel free to use as many code cells as needed.
In [15]:
"""CNN helper functions"""
def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.tanh(x)


def maxpool2d(x, k=2):
    return tf.nn.max_pool(
        x,
        ksize=[1, k, k, 1],
        strides=[1, k, k, 1],
        padding='SAME')

def conv_net(x, weights, biases):
    # Layer 1
    conv1 = conv2d(x, weights['layer_1'], biases['layer_1'])
    conv1 = maxpool2d(conv1)

    # Layer 2
    conv2 = conv2d(conv1, weights['layer_2'], biases['layer_2'])
    conv2 = maxpool2d(conv2)

    # Layer 3
    conv3 = conv2d(conv2, weights['layer_3'], biases['layer_3'])
    conv3 = maxpool2d(conv3)

    # Fully connected layer
    # Reshape conv3 output to fit fully connected layer input
    fc1 = tf.reshape(
        conv3,
        [-1, weights['fully_connected'].get_shape().as_list()[0]])
    fc1 = tf.add(
        tf.matmul(fc1, weights['fully_connected']),
        biases['fully_connected'])
    fc1 = tf.nn.tanh(fc1)

    # Output Layer - class prediction
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out
In [16]:
"""Define neural network architecture"""
n_input = 1024  # traffic sign data input (Shape: 32*32)
n_classes = 43  # total number of traffic sign classes

layer_width = {
    'layer_1': 32,
    'layer_2': 64,
    'layer_3': 128,
    'fully_connected': 512
}

# Store layers weight & bias
weights = {
    'layer_1': tf.Variable(tf.truncated_normal(
        [5, 5, 1, layer_width['layer_1']],stddev=1e-3)),
    'layer_2': tf.Variable(tf.truncated_normal(
        [5, 5, layer_width['layer_1'], layer_width['layer_2']],stddev=1e-3)),
    'layer_3': tf.Variable(tf.truncated_normal(
        [5, 5, layer_width['layer_2'], layer_width['layer_3']],stddev=1e-3)),
    'fully_connected': tf.Variable(tf.truncated_normal(
        [2048, layer_width['fully_connected']],stddev=1e-3)),
    'out': tf.Variable(tf.truncated_normal(
        [layer_width['fully_connected'], n_classes],stddev=1e-3))
}
biases = {
    'layer_1': tf.Variable(tf.zeros(layer_width['layer_1'])),
    'layer_2': tf.Variable(tf.zeros(layer_width['layer_2'])),
    'layer_3': tf.Variable(tf.zeros(layer_width['layer_3'])),
    'fully_connected': tf.Variable(tf.zeros(layer_width['fully_connected'])),
    'out': tf.Variable(tf.zeros(n_classes))
}

Question 3

What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.) For reference on how to build a deep neural network using TensorFlow, see Deep Neural Network in TensorFlow from the classroom.

Answer: I have 3 convnets followed by fully connected layer, similar to the provided example. In a separate file 'Jacquot_Traffic_Signs_Recognition_2' I used 3 convnets with dropout and relu. But there I could never get the accuracy about ~8%.

In [17]:
### Train your model here.
### Feel free to use as many code cells as needed.
In [18]:
# Set up variables for logging and displaying data
time_per_training_epoch = []
training_accuracy_validation = []
training_accuracy_testing = []
cost_list = []
stat_filename = 'save.p'

"""Initialize or load pickled file containing status data on training"""
    
def initialize_stat_file(stat_filename):
    print('Saving stats to file')
    data_to_save = {'time_per_training_epoch': [], 
                    'training_accuracy_validation': [], 
                    'cost_list': []}
    pickle.dump(data_to_save, open( stat_filename, "wb" ))

def load_stats_from_file(stat_filename):
    print('Loading stats from file')
    data_to_save = pickle.load(open( stat_filename, "rb" ))
    time_per_training_epoch = data_to_save['time_per_training_epoch']
    training_accuracy_validation = data_to_save['training_accuracy_validation']
    cost_list = data_to_save['cost_list']
    
def save_stats_to_file(stat_filename):
    print('Saving stats to file')
    data_to_save = {'time_per_training_epoch': time_per_training_epoch, 
                    'training_accuracy_validation': training_accuracy_validation, 
                    'cost_list': cost_list}
    pickle.dump(data_to_save, open( stat_filename, "wb" ))

def report_stats_from_file(stat_filename):
    print('Reporting from file to make plot')
    
    """Retrieve data"""
    unpickled_data = pickle.load( open(stat_filename, "rb" ) )
    time_per_training_epoch = unpickled_data['time_per_training_epoch']
    training_accuracy_validation = unpickled_data['training_accuracy_validation']
    cost_list = unpickled_data['cost_list']

    """Plot and save fig for time per training epoch"""
    fig_savename = 'time_per_training_epoch.png'
    fig = plt.figure()
    curr_plot1 = plt.plot(range(len(time_per_training_epoch)), time_per_training_epoch, color = 'b')
    plt.ylabel('Time per training epoch (sec)')
    plt.title('Training time per epoch', fontsize = 10)
    plt.show()
    curr_dir = os.getcwd()
    fig.savefig(fig_savename)
    fig.clf()

    """Plot and save fig for cost"""
    fig_savename = 'cost_list.png'
    fig = plt.figure()
    curr_plot1 = plt.plot(range(len(cost_list)), cost_list, color = 'b')
    plt.ylabel('Cost_list')
    plt.title('Cost_list', fontsize = 10)
    plt.show()
    curr_dir = os.getcwd()
    fig.savefig(fig_savename)
    fig.clf()

    """Plot and save fig for validation model accuracy"""
    fig_savename = 'training_accuracy_validation.png'
    fig = plt.figure()
    curr_plot1 = plt.plot(range(len(training_accuracy_validation)), training_accuracy_validation, color = 'b')
    plt.ylabel('Training accuracy validation')
    plt.title('Training accuracy validation', fontsize = 10)
    plt.show()
    curr_dir = os.getcwd()
    fig.savefig(fig_savename)
    fig.clf()

    print('Done reporting stats')
    plt.close("all")
    
In [19]:
logfile = 'log.txt'
f = open(logfile,'w')
f.write('Log file for training session\n') # python will convert \n to os.linesep
f.close() # you can omit in most cases as the destructor will call it

initialize_stat_file(stat_filename)
load_stats_from_file(stat_filename)
save_stats_to_file(stat_filename)
#report_stats_from_file(stat_filename)
Saving stats to file
Loading stats from file
Saving stats to file
In [20]:
# Parameters
batch_size = 128 
training_epochs = 20

# tf Graph input
x = tf.placeholder("float", [None, 32, 32,1])
y = tf.placeholder("float", [None, n_classes])

logits = conv_net(x, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits, y))
optimizer = tf.train.AdamOptimizer().minimize(cost)
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

#Save the model after training
saver = tf.train.Saver()
    
# Initializing the variables
init = tf.initialize_all_variables()

#Save the model after training
saver = tf.train.Saver()

print(train_data.shape, validation_data.shape, test_data.shape)
(37248, 32, 32, 1) (1961, 32, 32, 1) (12630, 32, 32, 1)
In [21]:
model_savename = 'model-checkpoint'
In [22]:
restore_model_for_continued_work = 0
initialize_stat_file(stat_filename)

# Launch the graph
with tf.Session() as sess:
    print('Beginning to train')
    sess.run(init)
    
    """Restore saved model for continued work"""
    if restore_model_for_continued_work == 1:
        print('Restoring model')
        start_time_restore = time.time()
        saver = tf.train.import_meta_graph('model-checkpoint.meta')
        saver.restore(sess, 'model-checkpoint')
        all_vars = tf.trainable_variables()
        elapsed_time = time.time() - start_time_restore
        print('Time to restore model (sec) = ', int(elapsed_time))

    # Training cycle
    for epoch in range(training_epochs):
        print(' ')
        print('Starting epoch %02d' % epoch)
        epoch_start_time = time.time()
        total_batch = int(math.ceil(len(train_data)/batch_size))
        print ('Number of batches to process = %0000d' % total_batch)
        with open(logfile, "a") as myfile:
            str_to_write = 'Batches per epoch, batch size, total_samples in training set: '
            str_to_write += str(total_batch) + ' ' + str(batch_size) +  ' ' + str(total_samples) + '\n'
            myfile.write(str_to_write)     
        # Loop over all batches
        print('Processing batches. Not yet saving.')
        
        for i in range(total_batch):
            batch_start = i * batch_size;
            batch_x = train_data[batch_start:batch_start+batch_size]
            batch_y = train_labels[batch_start:batch_start+batch_size]
            #start = i * batch_size
            #stop  = i * batch_size + batch_size
            #batch_x = training_data[start:stop, :, :]
            #batch_y = training_labels[start:stop]
            # Run optimization op (backprop) and cost op (to get loss value)            
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
            #acc = 0
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            #acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y, keep_prob: 1.})
            with open(logfile, "a") as myfile:
                str_to_write = 'Minibatch accuracy (epoch, iteration, accuracy): ' + str(epoch) + ' ' + str(i) + ' ' + str(acc) + '\n'
                myfile.write(str_to_write)            
        print('Done processing batches. Checking accuracy and logging epoch results')
        # Display logs per epoch step
        c = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
        accuracy_validation = sess.run(accuracy, feed_dict={x: validation_data, y: validation_labels})
        print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c)," validation accuracy=","{:,.3f}".format(accuracy_validation))
        epoch_stop_time = time.time()
        elapsed_time = epoch_stop_time - epoch_start_time
        print('Time to process epoch = ', int(elapsed_time))

        # Load stats, append, values, save, and report
        load_stats_from_file(stat_filename)        
        cost_list.append(c)        
        time_per_training_epoch.append(int(elapsed_time))
        training_accuracy_validation.append(accuracy_validation)                
        save_stats_to_file(stat_filename)
    print("Optimization Finished!")

    # Save model
    save_path = saver.save(sess, model_savename)
    print("Model saved in file: %s" % save_path)
    
    # Calculate accuracy
    print("Accuracy:",accuracy.eval({x: test_data, y: test_labels}))
Saving stats to file
Beginning to train
 
Starting epoch 00
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0001 cost= 0.471262217  validation accuracy= 0.865
Time to process epoch =  247
Loading stats from file
Saving stats to file
 
Starting epoch 01
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0002 cost= 0.088928156  validation accuracy= 0.974
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 02
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0003 cost= 0.026471311  validation accuracy= 0.984
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 03
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0004 cost= 0.010795375  validation accuracy= 0.988
Time to process epoch =  246
Loading stats from file
Saving stats to file
 
Starting epoch 04
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0005 cost= 0.002764778  validation accuracy= 0.994
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 05
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0006 cost= 0.001648130  validation accuracy= 0.994
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 06
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0007 cost= 0.000890159  validation accuracy= 0.995
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 07
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0008 cost= 0.000627030  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 08
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0009 cost= 0.000480835  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 09
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0010 cost= 0.000383650  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 10
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0011 cost= 0.000312303  validation accuracy= 0.995
Time to process epoch =  246
Loading stats from file
Saving stats to file
 
Starting epoch 11
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0012 cost= 0.000257707  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 12
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0013 cost= 0.000214586  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 13
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0014 cost= 0.000179952  validation accuracy= 0.996
Time to process epoch =  246
Loading stats from file
Saving stats to file
 
Starting epoch 14
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0015 cost= 0.000151413  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 15
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0016 cost= 0.000127863  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 16
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0017 cost= 0.000108157  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 17
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0018 cost= 0.000091596  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 18
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0019 cost= 0.000077653  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
 
Starting epoch 19
Number of batches to process = 291
Processing batches. Not yet saving.
Done processing batches. Checking accuracy and logging epoch results
Epoch: 0020 cost= 0.000065829  validation accuracy= 0.996
Time to process epoch =  245
Loading stats from file
Saving stats to file
Optimization Finished!
Model saved in file: model-checkpoint
Accuracy: 0.938717
In [23]:
report_stats_from_file(stat_filename)        
plt.close("all")
Reporting from file to make plot
Done reporting stats

Question 4

How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)

Answer: Adam optimizer, 128 samples per batch, 20 epochs, no hyperparameters.

Question 5

What approach did you take in coming up with a solution to this problem?

Answer: I tried playing around with different architectures, but ended up spending most of my time figuring out that TensorFlow expects float32 rather than integer or float64 for one-hot encoding.


Step 3: Test a Model on New Images

Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.

You may find signnames.csv useful as it contains mappings from the class id (integer) to the actual sign name.

Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [24]:
### Load the images and plot them here.
### Feel free to use as many code cells as needed.
In [25]:
"""Load the trained model from file"""
restore_model_for_continued_work = 1

if restore_model_for_continued_work == 1:
    with tf.Session() as sess:

        saver = tf.train.import_meta_graph(model_savename+'.meta')
        saver.restore(sess, model_savename)
        all_vars = tf.trainable_variables()
In [26]:
from scipy import ndimage, misc

read_dir = '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images'
files_png = []
for file in os.listdir(read_dir):
    if 'png' in file:
        files_png.append(os.path.join(read_dir, file))
print(files_png)

images = np.zeros((len(files_png), 32, 32, 3), 'uint8')

for i in range(len(files_png)):
    image = ndimage.imread(files_png[i], mode="RGB")
    image_resized = misc.imresize(image, (32, 32))
    images[i,:,:,:] = image_resized

print(images.shape)
['/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t1.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t10.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t11.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t12.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t2.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t3.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t4.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t5.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t6.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t7.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t8.png', '/Users/blakejacquot/Desktop/temp2/Udacity_SelfDrivingCar/Term1/Project2_TrafficSignClassifier/Jupyter_work/test-images/t9.png']
(12, 32, 32, 3)
In [27]:
five_candidates = images[0:5,:,:,:]

plt.figure()
for i in range(len(five_candidates)):
    plt.subplot(100+50+i+1)
    plt.imshow(five_candidates[i])

plt.show()     
    
#plt.close("all")
In [28]:
# Preprocess the data

X_five_candidates_preproc = five_candidates
X_five_candidates_preproc = grayscale_set(X_five_candidates_preproc)
X_five_candidates_preproc = normalize_set(X_five_candidates_preproc)

"""Expand data to have extra channel in order to be compatible with TensorFlow"""
if len(X_five_candidates_preproc.shape) != 4:
    print('Expanding data')
    X_five_candidates_preproc = np.expand_dims(np.array(X_five_candidates_preproc),3)

print(X_five_candidates_preproc.shape)
Making grayscale
Normalizing data
Expanding data
(5, 32, 32, 1)

Question 6

Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It would be helpful to plot the images in the notebook.

Answer: The most difficult signs will be the ones that don't have exact equivalents in the trained traffic sign database (e.g. images 1-3).

In [29]:
### Run the predictions here.
### Feel free to use as many code cells as needed.
In [30]:
num_el = 5
predicted_classes = []

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('model-checkpoint.meta')
    saver.restore(sess, 'model-checkpoint')
    for i in range(num_el):
        curr_im = X_five_candidates_preproc[i,:,:,:]
        curr_im = np.expand_dims(curr_im, axis=0)
        curr_predicted_class = sess.run(tf.argmax(logits, 1), feed_dict={x:curr_im})
        predicted_classes.append(int(curr_predicted_class))
In [31]:
print(predicted_classes)
print(' ')
for i in range(len(predicted_classes)):
    prediction = predicted_classes[i]
    curr_im_to_predict = five_candidates[i]
    position_index = class_dict_train[prediction]
    len_index = len(position_index)
    i1 = random.randrange(len_index)
    i2 = random.randrange(len_index)
    i3 = random.randrange(len_index)    
    index1 = position_index[i1]
    index2 = position_index[i2]
    index3 = position_index[i3]
    im1 = X_train[index1][:][:][:]
    im2 = X_train[index2][:][:][:]
    im3 = X_train[index3][:][:][:]
    
    print(' ')
    print('Showing predicted class and images')
    print('Top row, left is image to predict')
    print('Bottom row is 3 random images of predicted class from training set')
    print('Predicted class = ', predicted_classes[i])
    plt.figure()
    plt.subplot(231)
    plt.imshow(curr_im_to_predict)
    plt.subplot(234)
    plt.imshow(im1)
    plt.subplot(235)
    plt.imshow(im2)
    plt.subplot(236)
    plt.imshow(im3)
    plt.show() 

    
    
    
    
    
[13, 7, 40, 40, 14]
 
 
Showing predicted class and images
Top row, left is image to predict
Bottom row is 3 random images of predicted class from training set
Predicted class =  13
 
Showing predicted class and images
Top row, left is image to predict
Bottom row is 3 random images of predicted class from training set
Predicted class =  7
 
Showing predicted class and images
Top row, left is image to predict
Bottom row is 3 random images of predicted class from training set
Predicted class =  40
 
Showing predicted class and images
Top row, left is image to predict
Bottom row is 3 random images of predicted class from training set
Predicted class =  40
 
Showing predicted class and images
Top row, left is image to predict
Bottom row is 3 random images of predicted class from training set
Predicted class =  14

Question 7

Is your model able to perform equally well on captured pictures or a live camera stream when compared to testing on the dataset?

Answer: 0: Correct.
1: It incorrectly identified the '50' sign, though it did get the shape and that there was a number.
2: Fail, except for figuring out the sign was a circle in shape.
3: Fail, except for figuring out the sign was a circle in shape.
4: Correct.

In [32]:
### Visualize the softmax probabilities here.
### Feel free to use as many code cells as needed.

top_classes = []
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('model-checkpoint.meta')
    saver.restore(sess, 'model-checkpoint')
    for i in range(num_el):
        curr_im = X_five_candidates_preproc[i,:,:,:]
        curr_im = np.expand_dims(curr_im, axis=0)
        top_classes = sess.run(tf.nn.top_k(tf.nn.softmax(logits),k=5),feed_dict={x:curr_im})
        print(top_classes)
        print(' ')
        
        
TopKV2(values=array([[  9.98852015e-01,   4.85258730e-04,   2.00543742e-04,
          1.64562109e-04,   9.48647721e-05]], dtype=float32), indices=array([[13, 32, 36, 12,  3]], dtype=int32))
 
TopKV2(values=array([[  7.85672128e-01,   2.09298968e-01,   2.61847116e-03,
          1.07635534e-03,   4.30640473e-04]], dtype=float32), indices=array([[ 7, 40, 16,  1, 12]], dtype=int32))
 
TopKV2(values=array([[ 0.47229478,  0.3432824 ,  0.08301234,  0.06485683,  0.02218555]], dtype=float32), indices=array([[40,  1,  4, 15, 33]], dtype=int32))
 
TopKV2(values=array([[  9.79404569e-01,   8.53085984e-03,   6.48835767e-03,
          4.79911687e-03,   5.36326319e-04]], dtype=float32), indices=array([[40, 12,  2, 10, 37]], dtype=int32))
 
TopKV2(values=array([[  9.98996913e-01,   3.39900114e-04,   3.35413264e-04,
          2.70561810e-04,   3.64033658e-05]], dtype=float32), indices=array([[14, 34,  4, 13, 38]], dtype=int32))
 

Question 8

Use the model's softmax probabilities to visualize the certainty of its predictions, tf.nn.top_k could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)

Answer:

0: Very certain, and it correctly idendified the sign.
1: 78% certain. It got the sign mostly right.
2: 47% certain. It failed, so this is about right.
3: 97% certain. Since it failed, this is not good.
4: Very certain, and it correctly idendified the sign.

Question 9

If necessary, provide documentation for how an interface was built for your model to load and classify newly-acquired images.

Answer: N/A

Note: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to \n", "File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.